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I Abstract 

(N 

Compressive sensing (CS) is a sampling technique designed for reducing the complexity of sparse 
data acquisition. One of the major obstacles for practical deployment of CS techniques is the 
signal reconstruction time and the high storage cost of random sensing matrices. We propose 
a new structured compressive sensing scheme, based on codes of graphs, that allows for a joint 
design of structured sensing matrices and logarithmic-complexity reconstruction algorithms. The 
compressive sensing matrices can be shown to offer asymptotically optimal performance when used 
tyj I in combination with Orthogonal Matching Pursuit (OMP) methods. For more elaborate greedy 

reconstruction schemes, we propose a new family of list decoding belief propagation algorithms, as 
well as reinforced- and multiple-basis belief propagation algorithms. Our simulation results indicate 
^ ' that reinforced BP CS schemes offer very good complexity-performance tradeoffs for very sparse 

52 ' signal vectors. 

m 

' Keywords: Belief Propagation, Compressive sensing, Low-Density Parity-Check codes. Orthogonal 

Matching Pursuit, restricted isometry constants, sparse approximation, Subspace Pursuit. 

1 Introduction 

Compressive sensing (CS) has received significant attention due to their various applications in signal 
processing, networking, MRI data acquisition, bioinformatics, and remote sensing [Ij. CS is a sampling 
technique for compressible and/or /C-sparse signals, i.e., signals that can be represented hy K <^ N 
significant coefficients over an A^-dimensional basis. Sampling of a X-sparse, discrete-time signal x of 
dimension N is accomplished by computing a measurement vector, y, that consists oi m N linear 
projections, i.e., 

y = *x. 

Here, $ represents an m x matrix, usually over the field of real numbers [2j. 

Although the reconstruction of the signal x € from the possibly noisy random projections is an 
ill-posed task, the prior knowledge of signal sparsity allows for recovering x in polynomial time using 
m <^ N observations only. If the reconstruction problem is cast as an Iq minimization problem [3], it 
can be shown that in order to reconstruct a iC-sparse signal x, Iq minimization requires only m = 2K 
random projections. In this setting, it is assumed that the signal and the measurements are noise-free. 
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Unfortunately, the optimization problem is a combinatorial problem that for general instances of 
sensing is NP-hard. 

The work by Donoho and Candes et. al. [HEHHlS] demonstrated that CS reconstruction is a polynomial 
time problem - conditioned on the constraint that more than 2K measurements are used. The key 
idea behind their approach is that it is not necessary to resort to Iq optimization to recover x from 
the under-determined inverse problem: a much easier ii optimization, based on Linear Programming 
(LP) techniques, yields an equivalent solution provided that the sensing matrix $ satisfies the so called 
restricted isometry property (RIP), with a constant RIP parameter. 

While LP techniques play an important role in designing computationally tractable CS decoders, their 
complexity renders them highly impractical for many applications. In such cases, the need for fast 
reconstruction algorithms - preferably operating in time linear in A^, and without significant perfor- 
mance loss compared to LP methods - is of critical importance. A common approach to mitigating 
these problems is to increase the number of measurements and to use greedy reconstruction methods. 
Several classes of low-complexity reconstruction techniques were put forward as alternatives to linear 
programming (LP) recovery, including group testing methods [6], pursuit strategies such as Orthog- 
onal Matching Pursuit (OMP), Subspace Pursuit (SP) and Compressive Sampling Matching Pursuit 
(CoSaMP) [7H10]. and coding-theoretic techniques [TTH13] . 

We focus our attention on two intertwined problems related to low-complexity CS reconstruction tech- 
niques. The first problem is concerned with designing structured matrices that provide RIP-type perfor- 
mance guarantees, since such matrices have low storage complexity and may potentially yield to faster 
reconstruction approaches. The second problem is concerned with how to most efficiently exploit the 
structure of the sensing matrix in order to further reduce the reconstruction complexity of greedy-like 
methods that use correlation maximization as one of their key steps. The solution we propose addresses 
both issues, and can be succinctly described as follows. 

It is known that random Bernoulli matrices - matrices with i.i.d. Bernoulli (1/2) distributed entries - 
have constant RIP parameters with a number of measurements proportional to K log{N / K) [Tl[5]. This 
number of measurements suffices for exact reconstruction of i^-sparse signals using LP methods. One 
property of Bernoulli matrices is that, for sufficiently large dimensions, the fraction of the symbols +1 
and —1 per row and per column is close to one half. Furthermore, a similar property holds for any 
sufficiently large submatrix of the matrix. One approach to designing structured compressive sensing 
matrices would be to try mimicking this property of Bernoulli matrices and then showing that the 
matrices indeed have a constant RIP parameter. 

This task can be accomplished via linear error-correcting coding. Due to the linear structure of the 
code, using codewords of a binary linear code with zeros replaced by +l's and ones by — I's as columns 
of the matrix ensures the row-weight balancing property. Furthermore, if the weight of the codewords 
is chosen close to half of the codelength, similar concentration results will hold for the columns of the 
sensing matrix. 

The idea of using linear error-correcting codes was first proposed in [14J . where encodings of Reed- 
Muller codewords were used for columns of a compressive sensing matrix |15U16) . The authors proposed 
independently a similar framework based on low-density parity-check codes [l7j in [18] , and some follow- 
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up results on this work were reported in fl9]. Another approach for constructing sensing matrices by 
trying to match their distribution of singular values to that of Bernoulli matrices was put forward 
in [20IEI]. 

The advantage of using sensing matrices based on error-control codes from the perspective of recon- 
struction complexity is best explained in the context of greedy algorithms, as argued in our earlier 
work [18j. A key step of greedy reconstruction algorithms is to compute the correlations of the observed 
vector y with the columns of the sensing matrix and to identify the column with the largest cor- 
relation. When the columns of the matrix represent codewords of a linear code, this problem reduces 
to the extensively studied maximum likelihood (ML) decoding problem. For certain classes of codes, 
near-ML decoding can be performed in time linear in the length of the code, which in the described 
setting implies that near-optimal correlation optimization can be performed in time proportional to the 
number of rows, and not the number of columns of the sensing matrix. 

We focus on codes that lead to reconstruction techniques with sublinear - more precisely - logarithmic 
complexity in A^. The basic construction and decoding methods are based on ideas from codes on graphs 
and iterative decoding. We show how a simple combination of reinforced belief propagation (BP) [22\ 
and a novel list decoding method can be coupled with the greedy SP algorithm to produce good 
reconstruction algorithms with logarithmic complexity, for the case of "super-sparse" signals previously 
studied in |23| . As already mentioned, the BP algorithm operates on the columns of the matrix $ of 
length m, and consequently, its reconstruction complexity is O (m). 

Before outlying the organization of the paper, we would like to describe the context of our work within 
the vast literature on compressive sensing. Sublinear reconstruction techniques were first investigated 
in |23fl26) . while sparse sensing matrices coupled with BP decoding were considered in |1HI25). An 
idea for sublinear compressive sensing reconstruction inspired by Sudoku was described in [24] , but 
the algorithm works only for input signals with special structural properties where one requires that all 
sums of subsets of coefficients are distinguishable (which is rather restrictive for binary vectors), and 
where the measurement matrix is random. Furthermore, the reconstruction is only partial, in so far 
that the reconstruction complexity strongly depends on the number of recovered entries of the sensed 
vector. 

Our approach differs from all the aforementioned results in so far that it does not use sparse sensing 
matrices that are known to incur a performance loss compared to dense matrices, such as Bernoulli 
matrices. Although our structured sensing matrices are dense, they are constructed using codewords 
of large minimum distance LDPC codes which themselves have sparse matrix descriptions (i.e., sparse 
parity-check matrices). Furthermore, no high-complexity pre-processing is required and unlike the 
approach in [23], the complexity of the algorithm is not poly log in A^, but only logarithmic in A^; 
and, as opposed to using sparse matrices without RIP guarantees, our approach utilizes structured 
dense matrices constructed from sparse matrices that are asymptotically optimal with respect to the 
achievable coherence parameters. 

The problems addressed in this paper are equally relevant to questions arising in storage and wireless 
communication systems, since a major part of analysis is focused on BP decoding for channels with 
severe user interference. The framework proposed in this paper also allows for handling measurement 
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noise, but the underlying results will be described elsewhere. 

The paper is organized as follows. Section [2] provides a brief introduction to compressive sensing. 
Section [3] includes the description of a structured design approach for compressive sensing matrices 
amenable for 0(-R'"logA^) complexity decoding of super-sparse vectors, with u = 2 or u = 3. This 
section also contains the main analytical results of the paper. Section 3] describes a new "biased list 
decoding" framework for BP algorithms, a new CS-oriented reinforced BP algorithm, as well as the 
description of multiple-basis belief propagation algorithm for CS reconstruction. Section [5] presents the 
simulation results, while Section [6] contains our concluding remarks. 

2 Compressive Sensing and the Restricted Isometry Property 

Let X be an A^-dimensional real- valued signal with at most K non-zero components, henceforth called 
a i^T-sparse signal. Let supp(x) denote the set of indices of the non-zero coordinates of the vector 
X = (xi, . . . ,xn), and let |supp(x)| = || • ||o denote the support size of x, or equivalently, its io norm. 
Assume next that x G is an unknown signal with |supp(x)| < K, and assume that y G is an 
observation of x generated via m linear measurements, i.e., y = $x, where $ G ^rnxN -g j-gfgj-i-gfi iq a,s 
the sensing matrix. 

We are concerned with the problem of low-complexity recovery of the unknown signal x from the 
measurement y for the case that K « N. A natural formulation of the recovery problem is within an 
Iq norm minimization framework, which seeks a solution to the problem 

min ||x||q subject to y = $x. 

Unfortunately, the above minimization problem is NP-hard, and hence cannot be used for practical 
applications 

One way to avoid using this computationally intractable formulation is to consider a ii optimization 
problem, 

min ||x||^ subject to y = $x, 

where 

N 

ll^lll = ^ 

denotes the ii norm of the vector x. 

The main advantage of the £i minimization approach is that it is a convex optimization problem that 
can be solved efficiently via linear programming (LP) techniques. This method is therefore frequently 
referred to as ^i-LP reconstruction I4j(27j, and its reconstruction complexity equals O {im?N^f'^^ for 
small m, and O (-/V^) for large m. The method is based on interior point LP solvers |28) . 
The reconstruction accuracy of the ^i-LP method is described by the restricted isometry property (RIP), 
formally defined below. 

Definition 1 Let ^ ^ W^^^ , x G and / C {!,••• ,iV}. Also, let the matrix */ consist of the 
columns of ^ indexed by i €z I; similarly, let x/ denote a vector composed of the entries ofx indexed by 
the same set I. The space spanned by the columns of^j is denoted by span($/). 
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Definition 2 A matrix $ G ^mxN ^^^^ satisfy the Restricted Isometry Property (RIP) with 
parameters {K,S) for K < m, < S < 1, if for all index sets I C {1, • • • , A^} such that \I\ < K, and 
for all q G M'^', one has 

(l-5)||q||^<||*,q||^<(l + 5)||q||2. (1) 

We define 6k, the RIP constant, as the infimum of all parameters 6 for which the RIP holds, i.e. 

(5i^:=inf jj: {1 - 6) < \\^iq_\\l < {1 + 6) \\q\\l , V |/| < A', Vq G RI^I } . (2) 

Remark 1 Most known families of matrices satisfying the RIP property with optimal or near-optimal 
performance guarantees are random f27l \29 ^ . Note that the storage complexity of such random matrices 
satisfying the RIP property is very large. Alternative sensing matrix design methods rely on structured 
designs which may mitigate this problem (see fl5\\lb\[^\!^(Mcl<!l/ ). 

There exists an important connection between the LP reconstruction accuracy and the RIP property, 
first described by Candes and Tao in [4j. If the sampling matrix $ satisfies the RIP with constants 6k, 
62K, and 63K, such that 

Sk + S2K + SsK < 1, (3) 
then the £i-LP algorithm will reconstruct all A-sparse signals exactly. 

An alternative to li methods is the family of greedy algorithms, including OMP, Regularized OMP, 
Stagewise OMP, SP and CoSaMP algorithms [7H9tl33|IM]. The basic idea behind these methods is to 
find the support of the unknown signal sequentially. At each iteration of the algorithms, one or several 
coordinates of the vector x are selected for testing, based on the correlation magnitudes between the 
columns of $ and the regularized measurement vector. If deemed sufficiently reliable, the candidate 
column indices are used for the current estimate of the support set of x. The greedy algorithms iterate 
this procedure until all the coordinates in the correct support set are included in the estimated support 
set or until reconstruction failure is declared. 

OMP techniques were recently extended in a manner that allows for adaptively adding or removing 
sets of column candidates from the estimated list of columns. One algorithm that uses this idea is SP 
algorithm (for more details regarding the SP algorithm, the interested reader is referred to ^). Here, 
yl = resid (y, ^xe) denotes the residual vector of the projection of vector y onto the subspace spanned 
by ^j'l. For completeness, the fiow-charts of the OMP and SP algorithms are given in Table 1 and 
Table 2, respectively. Note that the columns of $ are denoted by (fii, . . . ,(p]\f. 

The computational complexity of OMP strategies depends on the number of iterations needed for 
exact reconstruction: standard OMP always runs through AT iterations, and therefore its reconstruction 
complexity is roughly O (KmN) operations. The same is true of the SP algorithm, except that for 
some classes of sensing vectors the complexity can be brought down to O (mN log K) . 
An important open question in CS theory is how to devise structured low-dimensional sensing matrices 
that can be decoded with vey low complexity algorithms that exploit the structural properties of 
the matrices. Restricting the choices for the sensing matrix to a special class of matrices necessarily 
introduces a performance loss, and one would like to investigate the trade-off between the performance 
loss and complexity of such reconstruction algorithms. Our results pertaining to these questions are 
presented in the following sections. 
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Table 1: The OMP Algorithm 
Input: y, $ and an error threshold eq 
Initialization: Initialize k = and set: 

• The initial solution to x*^ = 0. 

• The residual, resid (y, to r*^ = y — ^x" = y. 

• The initial support set to = supp (x") = 0. 

Main iteration: Increase A; by 1 and perform the following steps 

• Compute the errors e(j) = min^^ \\zj (fj — r^^-'^H^, where (pj denotes the j-th column of $ and 
Zj G M. 

• Update the support: Find a "minimizer", Jq, of e{j) : j ^ 5'^~^,e(jo) < e(i), and update S'' = 

• Compute x'^, the "minimizer" of ||$x — yUg subject to supp (x) = S^. 

• Update residual: Compute r'^ = y — ^x'^. 

• Stopping rule: If ||r^||2 < 7 stop. Otherwise, run another iteration. 
Output: The solution is x^, obtained after K iterations. 

3 Compressive Sensing Using LDPC Codes 

A binary linear block code with parameters [m, s,drain], C, is a s-dimensional subspace of an m- 
dimensional vector space over the finite field F2. Less formally, a code is a collection of codewords 
of length m that encode s information bits using m — s parity-check bits. The parameter dmin is the 
minimum distance of the code, defined as the smallest Hamming distance (total coordinate distance) 
between any pair of distinct codewords. The code rate is defined as i? = s/m0. 

A set of s basis-vectors of the subspace, arranged row-wise, forms a generator matrix of the code, 
denoted by G. A set of m — s basis-vectors of the null-space of C, arranged row- wise, forms a parity- 
check matrix of the code, denoted by H. Clearly, c G C iff He = 0. A low-density parity-check (LDPC) 
code is a linear block code with at least one "sparse" parity-check matrix |17| . The word "sparse" is 
given different meanings in different contexts. In this work, "sparse" refers to the property that every 
column or row of parity-check matrix has at most Wc or Wr non-zero entries, respectively, where Wc and 
Wr are constants that do not depend on m or s. 

^Note that our notation does not follow the standards in coding literature when it comes to denoting the codelength and 
dimension. We use this notation to prevent confusion with the standard notation used in compressive sensing literature. 
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Input: K, y 
Initialization: 



Table 2: The SP Algorithm 



1. = {K indices corresponding to the largest magnitude entries in the vector $*y}, where 
denotes the transpose of ^. 

2. y° = resid(y,*j.o). 

Iteration: At the i^^ iteration, go through the following steps 

1. = T^~^[J{K indices corresponding to the largest magnitude entries in the vector $*y^~^}. 

2. Set Xp = *t^y. 

3. = {K indices corresponding to the largest elements of Xp}. 

4. yf = resid(y,*^«) . 

5. If ||yf II2 > ||yr^^|l2' -'^^ ~ quit the iteration. 
Output: 

1. The estimated signal x, satisfying Xj^^ ... tvI-T*" = and x^^f = ^JpiY- 
3.1 Sensing Matrix Construction 

Consider a [m, s, dmin] LDPC code C that does not contain the all-ones codeword. We construct a 
m X {N = 2^* — 1) sensing matrix $ in the following mannei§|. 

First, we convert all non-zero codewords of C into their Binary Phase Shift Keying (BPSK) images, 
defined via the mapping — t- — 1 and — t- -|-1. Subsequently, we normalize each image by 1/y/m. 
These normalized codewords are used as columns of the matrix The columns of $ are, as before, 
denoted by (p^, . . . , (fj^, while Ci, . . . , cjy are used to denote their corresponding codewords. Since LDPC 
codes are linear codes, one can arrange the columns of $ lexicographically, so that locating one column 
of $ requires the order of 0(log A^) operations. 

The crux behind choosing a sensing matrix of this form is as follows: one of the most expensive steps 
in OMP and SP reconstruction is correlation maximization. Choosing $ to be composed of codewords 
of a binary linear code allows one to recast the problem of correlation maximization as the problem 
of finding the maximum likelihood (ML) codeword. Furthermore, due to the fact that LDPC codes 
are linear, each row of $ has A^/2 negative and A^/2 — 1 positive entries. To ensure that the matrix 

^Note that the proposed construction apparently imposes a restriction on the size of the sensing matrix in so far that 
the number of columns TV has to be one less than a power of two. But this restriction is not a strict one: it is possible to 
leave out as many codewords/columns of the matrix as needed to bring it to the size required by the application at hand. 
None of the performance guarantees are affected by this modification. 
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$ "globally" mimics the properties of a random Bernoulli matrix, one also needs to ensure that the 
columns of $ (codewords of the code) have weight close to m/2. Henceforth, we refer to matrices that 
have such properties as "Bernoulli- like" matrices. One has to ensure that these properties hold locally 
as well, which can be achieved by enforcing similar constraints on subcodes of the code. 
With respect to the first problem, we show that LDPC code with sharply concentrated distance spectra 
around m/2 exist. Regarding the second issue, we demonstrate that support weight enumerators |35] 
can be used to characterize the "local Bernoulli structure" of the matrix 

As before, let x G be a -sparse signal, and let y G R™ be the CS observation vector. The 
maximum correlation between distinct columns of the matrix is denoted by 

= max|(</3i, 97-) I, 

where 

m 
1=1 

The parameter is called the coherence parameter of the matrix There exists a fundamental lower 
bound on the value of the coherence parameter, 

c 



where c denotes a constant (see |36| and references therein). 

Most performance guarantees of OMP algorithms are expressed in terms of this parameter. The standard 
OMP algorithm guarantees exact recovery of the vector x as long as /i < 1/2K |37| . Such a bound, as 
we show later, holds for sensing "Bernoulli- like" matrices based on LDPC codes. 

It is straightforward to express the coherence parameter of the code-based matrix $ in terms of the 
Hamming distance dn between codewords, as shown below: 

{v>i, v^j) = 2^^i,i^j,i= 2^ —+ ( — ) = i 

J z — i ^ Yn mm 

_ ^ _ 2dYi{ci,Cj) 
m 

Here, di{{ci,Cj) denotes the Hamming distance between the codewords a and cj. Note that if for all 

dnici^Cj) 1 ]_ ^ ^ m - 2dH{ci,Cj) J_ 

m 2 AK' m 2K' 

and if for all i ^ j, 

duici^Cj) 11 m - 2dH{ci,Cj) _ 1 

m < 2 + m > 27^' 

Consequently, it holds that 

dnici.Ci) f 1 11 1 \ , , , , 1 

Hence, to guarantee exact recovery with the OMP algorithm for all i^T-sparse signals, we need to identify 
LDPC codes with 

1_J_ 1^ 

2 AK m 2 AK' 
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That such LDPC codes indeed exist is shown in the proposition below. In the proof, we use the following 
ensemble of codes from |38) . 

Ensemble E: The parity-check matrix H of the code is chosen with uniform probability from the 
ensemble of (m — s) x m (0,l)-matrices with row sums equal to Wr- Such codes are referred to as 
row-regular codes. 

Proposition 1 Consider an LDPC code with codewords q, £ = 1, . . . ,N , from the Ensemble E with 
odd Wr > 3. Let K G Z"*" and let K,m, N = 2* go to infinity with m > cK'^logN for some constant c. 
Then, with probability at least 1 — e~"^'^ , for some constant > 0, one has 

d^(c^ 1 J_ 
2 4K m 2 AK' ^ ' 



Remark 2 For even values of Wr, since both all-zero and all-one codewords are in the code, ^ is 
obviously not satisfied. In this case, one needs to expurgate all the codewords starting with the symbol 
(or, alternatively, starting with the symbol 1). This does not, however, change the asymptotic formula 
for the rate of the code. Also, note that in this case, the number of rows of the sensing matrix is K -times 
larger than the smallest number needed for LP reconstruction methods. 

To prove Proposition [H we introduce the following terminology. 

Let Cm be an ensemble of codes of length m defined by parity-check matrices of size (ttt, — s) x m. For 
a code C E Cm, let 

BiiC) = |{c G C : wt(c) = i]\, i = 0,1,... ,m, 
where wt(-) denotes the Hamming weight. The average ensemble distance distribution is 

B{Cm) = {Bo{Cm), Bi{Cm.), • • • , Bm{Cm)), 

where 

Bi{Cm) ■= Bi = —— ^ Bi{C). 
We use the following theorem from [38]. 

Theorem 1 Let a := {m — s)/s = 1 — R. For 9 £ (0,1), the elements of the average distance distribution 
are of the form 

bg:= lim -lnBem = H{e)+p^, 
where H{0) denotes Shannon's entropy function. For ensemble E, one has the following formula 

Pe = 

Proof of Proposition 1: Following the approach of [38j, we need to prove that with high probability, 
bg < for any ^ — j^, ^ + 47^)- If bg < 0, the average number of codewords of weight Om in the 
ensemble goes to zero as the length of the codes increases. 
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Let 6 = \ — ^^'^ choose e > such that 9 = Then 

H{e) = log2-l^log(l-e)-ii^log(l + 6) = log2-i^(-e + | + o(e2) 

1 + e / / 2^^ , , 2^ 

-^[^+J + o{^'))=log2- — + o{e'). 

With 61" = i - 3^ or e~ = dSJ becomes H {0-) = log2 - + o {^) . Note that H (9) < 
H{0-),y9 < 9-. Then, for a = 1 - i?, 

6e < be-=H{9-)+p'^.=log2-^+o(^-^^-{l-R)log2 

+ (l-i^)log(l+(^)^3=i?log2-^ + o(^)+o(^). 

Similarly, one can prove that V6' > 0^ = | + 

5,<V=/ilog2-^ + o(^)+o(^). 

Therefore, if < -^g]^, or equivalently, if m > (8/3) log (2) K'^logN, then yO ^ - ^ + ^) , 
bg < 0, when K, m, N go to infinity. This proves that the average ensemble's relative distance lies within 
the interval — ^ + ^) with probability at least 1 — e~"^^, for some constant u > 0. Therefore, 
according to the probabilistic method, there exists at least one code which exceeds the average, and 
this proves the claimed results. 

Remark 3 Proposition 1 suggests that when m > cK'^logN, our construction ensures that the max- 
imum correlation between distinct columns is small, i.e., /i < l/2i^. In fact, the following proposition 
about random Bernoulli sensing matrix shows that our LDPC based construction is asymptotically as 
good as the random Bernoulli counterpart. 

Proposition 2 Let X G j^mxA^ y^f^gg^ entries are randomly drawn from the Bernoulli distribution 
with parameter p = 1/2. Let Xi be the i^^ column of the matrix X. Let K,m,N approach infinity 
simultaneously. Then if and only if m> cK^ logN for some constant c > 0, one has max— \ {xi, Xj)\ < 

2^ with probability close to one. 

Proof of Proposition 2 See Appendix 16. 11 

We can now bound the RIP parameter of the code-based matrix Applying the Gershgorin circle 
theorem [39] to the matrix A = shows that all the eigenvalues of A lie in a disc D (1, r) centered 

at one and with radius r, where 

r = ma.x^\{(fi,^j)\, 
< Kii. 

Therefore, every eigenvalue A of A satisfies 

1 - K/i < A < l + K^i. 
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Hence, if /it < 1/2K, it is easy to see that the RIP parameter satisfies 

5k < 1/2. 

Alternatively, one can also easily show a more general result that any matrix with coherence parameter 
fi and sparsity parameter K satisfies the RIP with constant 6 = (K — l)/x. For the matrices based on 
LDPC codes, this result also implies S < 1/2 

3.2 Higher Hamming Weights and the RIP 

Henceforth, wc consider two classes of sensing vectors x. The first class, referred to as binary sensing 
vectors, has the property that all non-zero entries of the vector are equal to one. The second class 
has the property that the non-zero entries are drawn independently, following a standard Gaussian 
distribution. Such vectors will be called Gaussian sensing vectors. 

We describe next an interesting connection between code invariants, known as higher weights and 
support weight distributions, and the RIP property restricted to binary vectors x only. Furthermore, 
this example illustrates that local Bernoulli-like matrix properties depend on the support weights of the 
LDPC code used to construct 

Consider four codewords of a linear code of length m = 6 and their BPSK images, shown below 

00000 0, 11110 0, 00110 0, 11000 0, 

+ 1+1+1+1+1+1,-1-1-1-1+1+1, 

+ 1+1-1-1+1+1,-1-1+1+1+1+1. 

Together, these four codewords form a linear code that may represent a subcode of a larger code. The 
effective length of the subcode is four, while the length of the code is six - the last two coordinates are 
fixed to zero. 

Since a subcode of a linear block code that contains 2* codewords, with s > 2, has the property 
that the total number of zeros and ones for each coordinate in the effective support is 2^~^, one can 
partition the set of codewords into two subsets with 2^~'^ zeros and ones per effective coordinate. 
Consequently, summing up the BPSK images of the words in these two partitions will produce identical 
vectors. Arbitrary translations of those vectors cannot be distinguished by any compressive sensing 
reconstruction algorithm. In the example above, the sum of the first two codewords equals the sum of 
the last two vectors. 

The submatrix of $ induced by codewords corresponding to a subcode contains sub-blocks of identical 
entries (equal to +1) at positions outside of the effective support of the subcode. As a consequence, if 
there exist many subcodes of small effective support size and relatively large dimension, the matrix $ 
will not be a Bernoulli-like matrix in a local sense. In the example above, two out of six coordinates 
have a fixed value equal to one - and consequently, there exists a 2 x 2 submatrix of all ones in the 
induced sensing matrix. 

Consequently, the "quality" of a code for binary vector CS can be partly characterized in terms of 
the support weight distribution: there should be as few as possible subcodes in the code of small 
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dimension and small effective support weight. Note that in addition to subcodes, there may exist other 
"indistinguishable" collections of codewords, but these are in general extremely hard to characterize for 
a given LDPC code. 

None of the problems described above are encountered for the case of real- valued sensing vectors, since 
the probability of having the same weighting coefficients for two different subsets of columns is negligible. 
There exists a large body of work regarding the higher weight and support weight distribution of codes 
(see |35| and references within) , which provide information on the smallest support of a subcode of given 
dimension and the number of such subcodes. In the lemma that follows, we use dr to denote the r-th 
generalized Hamming weight of C, i.e., the size of the smallest support of an r-dimensional subcode of 
C. Additionally, -^(7) denotes Shannon's binary entropy function with parameter < 7 < 1. 

Lemma 1 JJU^ Let the rate of a family of codes satisfy R < 1 — 2"'' and let min{7 log2(2'' — 1); r(7 + 
(r — l)/n)} < r(l — R) — H{'~f), for some non-negative integer parameter r. Assume that the codelength 
m of the codes is sufficiently large. Then almost all [m,mR] codes C have dr{C) > 7m. 

The following proposition provides a necessary condition on the rate of an LDPC code to satisfy the 
binary vector RIP property with 52K < — 1, which guarantees exact recovery of i^-sparse signals 
under LP reconstruction |41) . 

Proposition 3 Consider an LDPC code-based sensing matrix $ and a sparse binary input signal. The 
necessary condition on the rate of the corresponding LDPC code for $ to satisfy the binary vector RIP 
property with parameter 5k < — 1 is 

^ k' log.,(K) K 

Proof of Proposition 3: We consider a binary vector x whose K = 2'' non-zero entries are at positions 
indexing K columns of $ that form a translation of an r-dimensional subcode of C . Since for a subcode 
with effective length d^, m — dr coordinates are equal to zero, it follows that 

||*x||| = {m-dr)K'^ < (1 -7)mK2. 

Then, for $ to satisfy the binary RIP property with parameter 5k, one needs 

(1-7)^2 < {1 + 5k)K, 

or 

7>1--^. 

Thus, we require 

7>1-^. (5) 
Invoking the conditions of the Lemma [H one can enforce 

7log2(i^ - 1) < log2(K)(l -R)- H{j), 
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or equivalently, 

log.2(A- - 1) H(',) 

From ([5]) and one can obtain necessary conditions on the LDPC code rate R that ensure that the 
sensing matrix satisfies the binary RIP property with parameter 5k < — 1 as stated in Proposition 
[3l From the previous result, it can be seen that the code rate needs to be very small in order to ensure 
that a matrix with sufficiently strong RIP constant exists. However, this requirement is only a necessary 
condition for the binary RIP property, while the RIP property is a sufficient condition for exact CS 
reconstruction. In fact, the code-based sensing matrix still performs very well for much higher code 
rates, as shown in the Simulation Results section. 



4 Algorithms 

One of the main steps of greedy reconstruction algorithms is to identifying one (or multiple) columns 
of $ that exhibit maximum correlation with the vector y. Usually, such a step is implemented via 
exhaustive search, which meant that at each iteration of the procedure, one needs to compute inner 
products and order them. The idea behind the CS technique based on LDPC sensing matrices is that 
columns of maximum correlation correspond to the most likely codewords, and can hence be efficiently 
identified using iterative decoding methods such as BP [17j. It is important to point out that BP 
decoders in this context operate in a vastly different regime than standard decoders: they must be 
able to handle high interference noise and potentially identify lists of most likely codewords for use 
in subspace pursuit techniques. Hence, BP methods have to be generalized to accommodate for signal 
interference through biasing and reinforcement. 

The main contributions of our work in the context of reconstruction algorithm design are twofold. 
First, we propose a novel list decoding BP approach for binary sensing vectors. The idea behind 
the list decoder is to "bias" the components of the measurement vector y in order to identify the 
most likely columns of the sensing matrix to have contributed to the observation. This biasing is 
coupled with a model for codeword interference borrowed from the theory of multiuser communication 
systems |42| . Second, we present the first modifications of M best codewords decoding, multiple-basis 
BP and reinforced BP in the context of compressive sensing reconstruction. All these algorithms are of 
independent interest in iterative decoding theory for interference channels. 

In what follows, we start with a brief introduction of the BP algorithm and the reinforced BP algorithm. 
We then proceed to describe the list decoding BP method for binary sensing vectors. We conclude 
the section by describing how to adapt three generalizations of the BP algorithm to real-valued CS 
reconstruction. 



4.1 The BP Algorithm 

Decoding of LDPC codes is standardly performed in terms of iterative algorithms that exchange reliabil- 
ity messages between the variable nodes and the check nodes of a bipartite Tanner graph representation 
of the code. Such algorithms are collectively known as a message passing (MP) algorithms, and they 
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include the BP algorithm, and in particular, the sum-product algorithm. For completeness, we describe 
the sum-product algorithm of |43| below. 

Throughout this section, we use ^(H) to denote a suitably chosen bipartite Tanner graph of an LDPC 
code C. The columns of H are indexed by the elements of the set of variable nodes (left hand side 
nodes), Vc, of ^(H), while the rows of H are indexed by the set of check nodes (right hand side nodes) 
Vn of g(H). 

Let us denote the set of variable nodes neighboring a check node indexed by a by J\f{a) = {b : Hab = !}• 
Similarly, denote the set of check nodes neighboring a variable node indexed by b hy Ai(b) = {a : Hab = 1}; 
the notation Ai(b)\a is reserved for the set M{b) excluding the check node a. 

During BP decoding, two types of probability messages Qab^x,, and rab^x,, exchanged between a check 
node a and a variable node b if and only if the parity-check matrix has a nonzero entry at the respective 
position, i.e., if and only if Hab = 1- Here, the message qab,xi, represents the probability that variable 
b of the transmitted codeword is Xb {xb G {0, 1}) given the information provided by all check nodes 
incident to variable node b, excluding a and the channel output corresponding to the variable Xb- The 
message rab,x,, represents the probability of check node a being satisfied, provided that the variable b 
has value Xb while all other variables incident to the check node have a separable distribution. The 
probabilities (lal^^, where the superscript (/) indicates the iteration index, are initialized to the prior 
channel probabilities, according to q^abo ~ Pbfl — P{xb = 0), and q^abi ~ Pb,i — Pi^b = 1) = 1 — Pb,o- 
The horizontal step of the l-th. iteration of the algorithm is initiated by computing the probabilities 
r^ab X,,- This is accomplished in terms of the fast implementation approach described in |43| : first, 
S^ab = f'^ab ~ '^ab 1 Computed according to 

srab = ( n (^it:o^ - ^1^:1)) I (^K^ - 1'^)- (7) 

The horizontal step is terminated by computing r^fe q = + <^^afe) and Vab.x — \^~ S^ab)- 
During the vertical step of the l-th iteration of the BP algorithm, the values of qab, o and g^fe, i are 
updated as follows. First, the pseudoposterior probabilities of Xb being zero or one are found according 
to 

(0 TT (0 

Qb'o =abPb,o [[ r'Jo 

beM{b) 

=abPb,i n ^Si' (^) 

beM{b) 

where Of, is chosen so as to ensure + qjl\ = 1. The probabilities described above are used for 
tentative decoding. The algorithm stops if the hard-decision values of the pseudoposterior probabilities 
represent a valid codeword of the LDPC code, i.e., a word c such that Hc"^ = 0. 

Otherwise, from the pseudoposterior probabilities, the values of qab,o and qab,i are calculated according 
to the equations: 

1ab,0 - OiabPb,0 Qb,o/^ab,0 



(0 (0 / (0 few 

Tail = %iraLv W 
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where Oab is chosen so that q^^ q + q^^ i = 1- These probabihties are passed on to the horizontal step 
of the next iteration. 

4.2 Reinforced BP 

The BP algorithm is known to efficiently reach a near-optimal solution whenever the variables are 
"biased" or directed towards such a solution. The idea behind reinforced BP is to provide an additional 
bias towards the correct solution by combining properly weighted messages from the past and the 
present |22| . The Tanner graph of the code and the reinforced decoder contains an additional level of 
nodes, the reinforcement nodes, that serve to store messages passed on by the algorithm in the past. We 
restrict our attention to memory-one reinforcement strategies, were only messages from one previous 
iteration are stored. The reinforced BP algorithm is described in detail below. 

RBP modifies the messages from variable nodes to check nodes: the idea is to introduce an extra term 
into ([9]). This is accomplished as follows. Define the marginal function of variable b at iteration / as 

In RBP, the horizontal step ([7]) does not change compared to standard BP, whereas the vertical step 
([UD is modified as described below: 

iL,=«a.(5«(6))^(SM. Q^XL- (10) 

Here, Xf, G {0, 1}, aab is a normalization factor and : [0, 1] — t- [0, 1] is a non-decreasing function, 
often chosen to be of the form 

where V'O) V'l ^re in [0, 1]. The speed of convergence of reinforced BP depends on the values of ^{l), and 
therefore, these parameter have to be carefully chosen. 

4.3 List Decoding BP, BP-OMP, and BP-SP for Binary Vectors 

We focus our attention on a new class of BP decoders for binary vectors x only. The BP decoder has 
the purpose to identify the column of $ that maximizes the correlation with the measurement vector 
y. In this setting, two observations are at place. 

First, when K = 2, both columns (codewords) of $ used in the superposition have the same correlation 
with y. This is a particularly difficult setup for BP decoding, since the measurement vector y may 
fall in the middle of the decision regions for the two codewords, but not converge to any of these two 
codewords. This motivates analyzing strategies for biasing the estimates for certain coordinates in y so 
as to move the vector away from the boundary region. Biasing may be performed using reinforced BP, 
but as we show shortly, there is a much simpler and more efficient way to perform this biasing directly 
when the sensing vector is binary. 

Second, whenever one tries to identify the column with largest correlation, all the remaining columns 
in the superposition act as interference. For large K, this interference may be very severe. Hence, the 
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question of interest is to find an adequate model for the interference and investigate the performance 
of BP decoders for a very non-standard operational regime - namely, a regime significantly below the 
code's designed threshold. 

We start with the description of the biasing scheme for binary vectors. For illustrative purposes, we first 
describe the scheme for the case K = 2. We observe that for if = 2, the entries of the measurement 
vector y must lie in the set {—2, 0, 2}. Whenever yi = +2 or -2, both entries of the two columns of 
* in the superposition must be +1, or —1, respectively. For zero entries of y, one column must take 
the value —1 and the other must take the value +1 at that given coordinate. Therefore, one simple 
idea is to bias the ±2 entries toward some large value, ideally ±00, to ensure that the BP algorithm 
decodes the corresponding bits correctly. One can further bias one of the remaining zero entries of y 
either towards a large positive or negative value in order for the BP algorithm to decode it to +1 (or 
— 1), respectively. 

For the more general case, whenever yi = +K or —K, one can bias the corresponding entries towards 
some large positive or negative value, respectively. When y^ 7^ ^K, at least one of the columns must 
have the value +1 and at least one other column must have the value —1 at that coordinate. Note 
that biasing of values 7^ ibif may be problematic, since cither of the two values +1 and —1 is equally 
plausible for the ML codeword. Consequently, one should run the BP algorithm for different choices of 
biasing coordinates, which both increases the complexity of the scheme and makes the search list larger. 
Consequently, we restrict our attention to biasing ztif and ^{K — 1) values only, and limit the later 
biasing to only a few randomly chosen entries satisfying the required constraint. This leads only to a 
constant increase in complexity of the scheme. The generated list of potential columns (codewords) has 
to be expurgated from vectors that correspond to words obtained by divergent BP runs and repeated 
vectors. If the output list has more than K codewords, one outputs only K codewords from the list 
that have largest correlations with y. Otherwise, if the list has less than K entries, one outputs the 
whole list augmented by a set of randomly chosen columns that make the list of size K. 
As stated earlier, whenever one wants to identify a column (p^ in the superposition, the remaining 
columns act as interference. 

The algorithm outlined above is summarized in Algorithm 3. The list decoding BP algorithm can be 
combined with standard OMP reconstruction techniques, as summarized in Algorithm 4- Furthermore, 
the SP algorithm can be easily modified to include the biased list-based BP decoding strategy as its 
correlation maximization step so as to reduce its overall complexity. 

The performance of the BP list decoder can be assessed in several different ways. One way is to 
declare reconstruction success if at least one of the columns of $ has been identified correctly. The 
second measure can be the probability of identifying all columns of $ within the given superposition. 
The larger the size of the list, the larger the probability of finding the correct codewords in the list. 
However, increasing the list size also means larger computational complexity, since one has to run the 
BP algorithm for each element in the list. 
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Algorithm 1 Biased List-based BP (BLBP Algorithm 

Input: K, H, y, biasing list size L, large, a large positive biasing value B. 

• Construct the list RX = (rx^-"^), rx^^^, . . . , rx^^^), where 

— rx^^) = y 

— rx(^) is the vector y with entries ±K reset to ±B 

— Randomly generate an index set J of size (L — 2)/2, where rxj = ziz{K — 1), Vj G J. 

— For each j £ J', bias the j*^ coordinate of the vector rx^^^ to +B and —B to get rx^^'^^^ 
and rx(2'+2)^ respectively, (/ = 1, 2, . . . , (L - l)/2). 

• Run the BP algorithm for rx^*) and output a binary column vector Vj = 1, . . . , L). Delete all 
the words which do not satisfy Hvj = and delete all the repetitions in the list. 

• Output at most K words Vj whose BPSK images have largest correlations with y. 
Output: A potential list of columns of $ that contributed to y. 

It is straightforward to see that the computational complexity of the BP-OMP algorithm equals 
0{KLm). This follows from the fact that BP runs in time proportional to the length of the code, 
provided that the number of iterations is fixed. For fixed list size L, which is the setup we used, the 
complexity equals O [Km). It is worth pointing out that BP is a suboptimal algorithm, and that its 
exact analytical performance characterizations is still not known. 

Algorithm 2 BP-OMP Algorithm 
Input: K, H, y, L, B. 

Initialization: rx = y 

Iteration: Run the iteration K times 

• Run the Biased list-based BP algorithm and output a list of potential columns 

• Pick the codeword v whose BPSK image has largest correlation with rx 

• Update: K = K- I, r^ = y- BPSK(v) 
Output: The list of K columns of $ in the superposition. 

4.4 List Decoding BP Methods for Gaussian Sensing Vectors 

We describe next how to implement list decoders for real- valued sensing vectors. The biasing methods 
we described in section 2] can not be used for vectors x that are non-binary, due to the fact that biasing 
is contingent on the vector y having bounded integer values. To mitigate this problem, we propose using 
three different approaches, previously described in the context of graphical model analysis only. All 
these methods may be seen as schemes that bias the BP decoder performance in a way that allows one 
to generate multiple candidates for the ML codeword. The methods are the M Most Probable Config- 
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urations (MMPCs) algorithm |44| . the multiple-basis belief propagation (MBBP) algorithm introduced 
by one of the authors in [45| . and the previously described reinforced BP algorithm. 
The idea behind the MMPCs algorithm can be summarized as follows. First, the most probable con- 
figuration xi is found by running the BP algorithm. The symbol of largest marginal probability is 
consequently frozen to its complement. In the second step, the algorithm performs a search for the next 
best set of marginal values with the frozen symbol, and then recalculates the marginal probabilities. 
Freezing the symbol value allows one to avoid finding the most probable configuration once again in 
the second round of decoding. The output of the second step of the decoding process is denoted by 
X2. Similarly, x^ is found in such a way that it differs from xi,X2, . . . ,Xfc_i in at least one coordinate. 
The interested reader is referred to |44| for more details regarding this algorithm and several illustrative 
examples. The steps of the method, referred to as the MMPCs algorithm, were adapted from [54] and 
are summarized in Algorithm [3l 

On the other hand, the MBBP algorithm utilizes several parity-check matrices in parallel to identify 
columns of $ that have largest correlation with y. The different parity-check matrices are carefully 
chosen so as to bias the decoders in as many different directions leading to a number of near-optimal 
solutions |45j. This can be attributed to the fact that parity-check matrices with different degree 
distributions are employed, which may perform differently for the same interference model. 
The steps of the MBBP algorithm adapted for CS are summarized in Algorithm 4. 
Note that both the MMPCs and MBBP algorithms may be applied to binary sensing vectors as well. 
The computational complexity of the algorithms is 0{Mm) and 0{Lm), where in the latter case, L is 
used to denote the number of parallel decoders (in analogy with the biased BP algorithm for binary 
vectors where L denotes the number of runs of the BP algorithm). Note that MBBP is performed in 
parallel for all decoders, and theoretically takes only 0{m) operations, but its storage complexity is L 
times higher due to the need for storing multiple parity-check matrices. 

Finally, the MMPCs and MBBP algorithms may be combined with the OMP and SP algorithm for 
compressive sensing reconstruction. One problem associated with the use of these algorithms is that 
each run of MMPC may produce less than K vectors since the BP algorithm may not converge. To 
solve this problem, one may randomly pick columns of $ to fill up the list, as described for binary 
sensing vectors. 

5 Simulation Results 

Throughout our simulation section, we follow a common practice in coding theory to illustrate the 
performance of different decoding/CS reconstruction algorithms on one fixed code. In our simulations, 
the code is a length 160 progressive edge growth (PEG) row-regular LDPC code |46j . The dimension 
of the code is s = 10, resulting in a coderate of 1/16. In order to highlight the drawbacks and the 
advantages of different algorithms, we will describe our findings for other code parameters as well. For 
example, we may change the length of the code, or change its row- or column-degree. 
In all our simulation, we model the interference as a zero-mean Gaussian variable with variance cj^ = 
max(|y|)(K-l)/K. 
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Algorithm 3 The MMPCs Algorithm 



Input: K, H, y. 
Initialization 



SCOREi(i,;) = max Pr(X = x|y), 

x:x(i)=j 

xi(i) = argmaxSCOREi(z, j), 
j 

CONSTRAINTSi = 0, 
USED2 = 0. 



Iteration: For k = 2 to K: 



SEARCHjfc = {i,j,s<k:xs{i)^j, 

0USEDfc), 

{h,jk,Sk) = arg max SCORE^.(i, j), 

(i,j,s)eSEARCHfc 

CONSTRAINTS^ = CONSTRAINTS^;, U {(x(ife) = jk)}, 
SCOREfe(z,j) = max Pr(X = x|y), 

^ x«=j, CONSTRAINTS., ^ 

Xfc(i) = argmaxSCOREi(i, j), 

i 

USEDfe+i = USEDU{(4,jfc,Sfc)}, 
CONSTRAINTS^, = CONSTRAINTS^, U {(x(ifc) / j^)}, 
SCORE,,(z,j) = max Pr(X = x|y). 

x(i)=j, CONSTRAINTS^, 



Output: K vector with highest correlation with y 



Algorithm 4 MBBP Algorithm 



Input: K, Hi, H2, . . . , Hp, y 

• For i = 1 to p: Run the standard BP algorithm on y using the matrices Hj, i = 1, . . . , L, and 
output a word Vj for each decoder if it converges 

• Delete all the column vectors which do not satisfy Hvj = 

• Output the codewords whose BPSK image have the highest correlation with the received vector 
Output: One potential column of $ in the superposition. 

We start with an analysis of the performance loss of LDPC code based sensing matrices, as compared to 
random-like Bernoulli matrices. For our simulations, we generated two types of parity-check matrices. 
One class of matrices is generated randomly with uniform row-degree four and with columns of degree 
four or three (row-regular matrices) and the other class is generated randomly with uniform column- 
degree three and with rows of degree three and four (column-regular matrices) . One can easily see that 
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with row-regular parity-check matrices, the aU-one codeword is in the codebook. Therefore, there exist 
codewords Ci, C2 in the codebook such that Ci -I-C2 = 1, or equivalently, codewords whose BPSK images 
cancel out. To avoid this problem, for row-regular parity check matrix, we only use the codewords 
whose first coordinate is zero in order to form the sensing matrix This coordinate is subsequently 
deleted from all the codewords. Hence, in this case, the code originally had dimension 11, and was 
subsequently shortened. 

Figures [T] a) and b) illustrate the performance of the standard OMP and SP reconstruction algorithms 
applied to sensing matrices constructed using column-regular LDPC codes or row-regular LDPC codes. 
The performance plots are given for both binary and Gaussian sensing vectors. Figure [2] shows the same 
performance curves for a randomly generated sensing matrix of the same dimension, using realizations 
of i.i.d. Bernoulli random variables. 



^ 10"^ OMP performance, matrix size 160x1024, 10000 samples 

— ^ — Column regular, binary 
— e — Row regular, binary 

- * - Column regular, Gaussian 

- H — Row regular, Gaussian _^ 



SP performance, matrix size 160x1024, 10000 samples 
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— Column regular, binary 
} — Row regular, binary 

- Column regular, Gaussian 

- Row regular, Gaussian 




10 12 14 16 18 20 



(a) Performance of the OMP algorithm with LDPC code (b) Performance of the SP algorithm with LDPC code 
based sensing matrices based sensing matrices 



Figure 1: Reconstruction error probability of the standard OMP (a) and SP |(b)| algorithms. 



The findings are rather interesting: they indicate that, at least with respect to the cut-off density 
(defined as the smallest value of K for which the reconstruction error is not confined below some small 
threshold probability) the LDPC code based sensing exhibit almost no performance loss compared to 
the random like matrices. For example, the cut-off for binary sensing vectors under OMP and SP 
reconstruction for random-like sensing matrices are equal to 8 and 18, respectively. For Gaussian 
signals, these thresholds change to 12 and 27, respectively. For LDPC code based matrices, the cut-off 
thresholds are only slightly smaller: 7 and 19, and 11 and 24, respectively. Note that row-regular codes 
seem to slightly outperform column-regular codes for this particular example, although we did not notice 
a general trend in performance gain/loss that can be attributed to this particular code characteristic - 
sometimes column-regular codes perform better than row-regular codes, and sometimes the opposite is 
true. 

Next, we present simulation results for different BP correlation maximization strategies, coupled with 
the OMP and SP algorithm. The simulation results show that non-systematic LDPC code sensing 
matrices do not perform well under CS reconstruction when K is small. However, their systematic 
forms - forms given by = [I|P], where I denotes the identity matrix of size (m — s) - of the matrices 
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Figure 2: Performance of the OMP and SP algorithms with random sensing matrices 



have very good performance. This suggests that in the high interference regime, systematic parity-check 
matrices work better than non-systematic ones. This may confirm the recently described results of |16| . 
which can be very roughly stated as follows: for very low SNR regimes, "bad" LDPC codes outperform 
"good" ones. 



Figures 3(a) and 3(b) illustrate the performance of the SP algorithm with a BP correlation maximization 
component and of the MMPCs-OMP algorithm. In both case, we used systematic forms of row-regular 
and column-regular parity-check matrices, resulting in sensing matrices of size 160 x 1024. For the 
BP-SP algorithm, we used a list of size L = 20. The reconstruction error probability is defined as the 
probability of not finding all the correct columns in the output list. OMP algorithms combined with 
BP methods for identifying codewords with largest correlations exhibit slow convergence, which does 
not make them amenable for CS applications. 



BLBP+SP and MMPCs+SP algorithm, matrix size 160x 1024, 5000 samples 




Performance of OMP + MBBP algorithm, matrix size 160x1024. 5000 samples 



— X — Row regular, binary 
— e — Column regular, binary 

- ^ - Row regular, Gaussian 

- e - Column regular, Gaussian 




(a) Simulation results for the SP algorithm using the (b) Reconstruction error probability for the OMP al- 
biased list-based BP decoder and the MMPC algorithm gorithm using MBBP as its correlation maximization 
as its correlation maximization step. step. 



Figure 3: Performance of list-decoding algorithms coupled with OMP and SP. 



21 



RBP vs BP. matrix size 1 60(1 024, 5000 sampies 




Figure 4: Reconstruction error probability of the OMP algorithm for binary and Gaussian sensing 
vectors, using BP or RBP as its correlation maximization step. 



Figure [3(b)| shows the simulation results for the OMP algorithm using MBBP as its correlation maxi- 
mization step for both binary and Gaussian signals. We used four parity-check matrices (p = 4): one 
of them is either row-regular matrix or column-regular, while the remaining three parity-check matri- 
ces represent systematic forms of the original matrix, obtained via Gaussian elimination applied with 
different column orders. The reconstruction error probability is once again defined as the probability 
of not finding all the correct columns in the output list. As can be seen, the MBBP method performs 
well for Gaussian signals x, although binary vectors still represent a reconstruction challenge. 
Figure m compares the performance of the MBBP algorithm when using RBP (MBRBP) instead of the 
standard BP algorithm (MBBP). The parity-check matrix used for these simulations is column-regular. 
For the simulations, the parameters were set to ipo = 0.8 and ipi = 0.99. One can see that for small 
values of K, the standard BP algorithm performs slightly better than the RBP algorithm. However, 
RBP performs much better than BP for larger values of K. 



Finally, we present simulation results for codelengths that differ from 160. Figure 5(a) illustrates 
the performance of a large dimensional sensing matrix (256 x 65536) constructed using LDPC codes. 
Figure [5(b) [ illustrates the influence of the choice of column- and row-regular matrices on the performance 
of the sensing scheme for a parity-check matrix based on a length 144 LDPC code. In Figure 5(b) , the 
reconstruction algorithm used is the list-decoding SP method. 



6 Conclusions 

We described a new method for structured design of compressive sensing matrices based on LDPC 
codes. The special structure of the matrices makes them amenable for low-complexity reconstructions 
of supersparse signals via new variants of BP-based decoding. We presented both theoretical results for 
the bounds on the coherence parameter and the RIP of such matrices, as well as simulation results for 
the BP based reconstruction methods. 
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(a) Performance of the OMP and SP algorithm for (b) Performance of the OMP and SP algorithm with list 
"short" LDPC codes based sensing matrices. decoding BP methods for "tall" LDPC codes based sens- 

ing matrices. 

Figure 5: Performance of the OMP/SP algorithms on large and small structured sensing matrices. 



6.1 Proof of Proposition [2] 

To prove Proposition [21 we need to prove two claims. 

1. When K,m,N are sufficiently large with m > cK^logN, max^ \ {xi,Xj)\ < ^ with overwhelm- 
ing probability. 

2. Conversely, for any e > 0, if K, m, N are large enough with m = O log , max^ | {xi, Xj)\ > 
2^ with probability at least 1/2. 

To prove the first part, we use large deviations technique and the union bound. Let us first fix i 7^ j. 
Note that ^{xi,Xj) = ^YlT=i-^i,k-^j,k where Xi^j. and Xj^^ a-re the A;*^ elements of Xi and Xj, 
respectively. Let = Xi^^Xj^i^.. Clearly, Z^'s, k = 1,2,- •• ,m, are independent Bernoulli random 
variables. The generic moment generating function of such variables is given by 



E 



1 f) 1 
-e^ + -e" 

2 2 



According to the large deviations principle |47) , one has 



1 / I 

lim — log Pr ( max — I (xi, Xj)\ > x 



-I{x) , Vx > 0, 



(11) 



where / (x) is the rate function, given by 

/ (x) = max x9 — log ( E 



max x9 — log ( — H — e ^ 
^ ' 2 2 



(12) 



We compute the derivative of x9 — log (^e^ + ^) with respect to 6 and set it to zero. The optimal 
9 needed to achieve the maximum in the definition of the rate function / (x) is given by 



1 , 1 + x 
-log- . 

2 1 — X 



(13) 
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Substituting the explicit form for 9* into the definition of / (x) gives 



I{x) 



1 



+ o[x^). 



(14) 



Hence, according to the large deviations principle in (jlip . and using 0^,(1) to denote a value vanishing 
with V, 

1 



Pr ( 



max — I (a;j, Xj)\ > x 



exp 



< exp 



-m 



-x^ + o{x^) (1 + 0,^(1)) 



-2m Qx^ + o (x^)^ 



when m is sufficiently large. We now apply the union bound for all possible 1 < i ^ j < N . One has 



Pr (max — 



< 



exp 



{Xi, Xj)\ > 



-2m Qx^ + o (x^)^ 



< exp [-mx^ (1 + o^. (1)) + 2 log A^] . (15) 

Set X — 27^"- Clearly, as K,m are sufficiently large and m > cii^^logA'^ for some constant ci, the 

probability in (fT5|) can be made as small as e"™"^^ for some constant C2. This proves the first part of 
the claim. 



The converse is proved as follows. Note that a lower bound for Pr ( max— \ {xi,Xj)\ > ) is obtained 
by fixing i = 1, i.e., 

Pr (max— \{x„Xj)\ > ) > Pr (max— \{xi,Xj)\ > ) . 
yt^jm 2K J \ ji^i m 2K ) 

It suffices to prove the lower bound is nontrivial. Clearly, ^ {xi,Xj) = ^ X^fcLi ^i,kXj,k = ^ X^fcLi 
where Z^s are independent Bernoulli random variables with parameter p = 1/2. The large deviations 
analysis in (jlip and (jl4p is still valid. More importantly, the variables ^ (a;i,a;j)'s are independent for 
different values of j. Hence, 



2K 



1 - Pr 



N-l 



2K 



1 M 1 \ 

m ' - 2KJ 



It suffices to prove 



Pr 



Af-l 



m ' ^'^ - 2K) - 2 

N-l ( 1 



(16) 



To simplify the notation, let -PgQg,! denote Pr (— |(a;i,a;j)| < 27^) and let Z denote the random 
variable ^ \ {xi,Xj)\. Note that 



log^goal = (^-l)logPM^<^ 



(iV-l)log(l-Pr(Z> — 
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When m is sufficiently large, Pr (Z > ^) is sufficiently small according to the large deviations principle. 
Hence, for sufficient large K and m, one has 




' = ' -- exp iy-^rn^ (1 + ok (1)) j 



Suppose that m = cK^ log A'^ for some constant c. 




No matter how large c is, as long as K,m,N are sufficiently large, we have c/K'^ < 1. Hence, 



for large K, m, N. The desired ()16p therefore holds. This completes the proof. 
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